SYSTEM AND METHOD FOR INTERNATIONALIZING THE CONTENT OF
MARKUP DOCUMENTS IN A COMPUTER SYSTEM 

[0001] The present invention concerns a method for internationalizing the content
of markup documents in a computer system, and more particularly the content of pages on
the Web (more commonly called Web pages in computer literature), as well as a system for
implementing this method.

Prior Art 

[0002] The present invention relates to the internationalization of markup 
documents. 

[0003] The term "document" is intended in the broad sense, i.e., a text, a sound
extract, a video document, a program, or any other type of information medium or
combination of such media.

[0004] A markup document is a document that includes tags or markers (both 
terms are used in computer literature), i.e., special codes that control, in particular, the 
structure and/or the appearance of the documents in the software using them. 

[0005] The present application describes the example of a markup page on the
Web, i.e., a computer document such as, for example, a text file, an image, or a video into
which special codes (the tags) have been inserted that control the structure, the
appearance, the dynamic behavior, etc., of the page in the software for navigating on the
Web (commonly called a Web browser in computer literature). A Web browser is a piece
of software used to present a document to a user, and to keep track of the relationships
established between this document and other documents by means of links on the Web
(commonly called Web links in computer literature). A Web link is a reference that makes
it possible to designate an access protocol, a host system, an access path in this system, and
possibly an anchor, thus making it possible to access a document or one of its parts.

[0006] Today, in the majority of cases, Web pages are created using markup
languages. The most commonly used language is HTML (HyperText Markup Language).
Other languages are beginning to be used, such as XML (Extensible Markup Language),
but they are essentially very similar to HTML.

[0007] The internationalization of a document consists of allowing and facilitating
the localization of said document into a given language or culture. The localization of a
document is the procedure that consists of implementing means for transcribing said
document into a given language or culture. Internationalization concerns, for example, the
translation of text, sound and/or video messages, etc., the transformation of typed
elementary data (dates, numbers, monetary values, etc.), concept representation
(representation of an icon of the "DANGER sign" type in the highway code), information
sorting (information sequencing), encoding (digital translation of a piece of information
into a given format), and the manipulation of information (the manipulation of character
sets: concatenation operations, capitalization, etc.), etc.

[0008] It is important to note that what affects the presentation of a document has more
to do with its personalization than its localization. For example, the choice of colors, the
character fonts or character sizes, the layout of the paragraphs, etc., is generally not part of
internationalization/localization. On the other hand, certain aspects of the rendition of
a document, like accommodating the direction in which texts are read, which creates
problems in framing, positioning action buttons, etc., are internationalization problems.

[0009] In the case of software internationalization, localization is made necessary
by the expansion of markets (increasing foreign sales), by client or even legislative
requirements for using software and documents in one's native language, and by
constraints related to integration, maintenance, confidentiality or patrimonial protection.
Moreover, software designers do not want to handle the dissemination of the sources of
their software, explain to third parties the places in which messages must be modified,
provide support for the errors resulting from these modifications, reveal trade secrets, etc.
Localization must therefore avoid the recompilation or delivery of sources.

[0010] Nowadays, there is no solution that handles the internationalization of the
content of Web pages. In general, Web page providers simply duplicate the entire page
and completely replace the content to be localized, usually by hand.
Linguistic/cultural experts are required to know the formatting language of the
documents, for example the HTML language, and its subtleties, or to use HTML page
editors. In any case, they are required to have the pages in their entirety, and hence all of
the HTML elements, in order to be able to work on them.

[0011] One problem addressed by the invention is for a software publisher to be able to
internationalize computer (software or other) documents or to offer his clients Web pages
that can be internationalized while avoiding any client involvement in the localization
process. Localization must avoid the need to translate the document, and for example all
of the pages on the Web, into all of the languages.

[0012] Another problem is the growing complexity of the HTML language, and of
structuring and formatting languages in general, the sophistication of Web page content
(the content of the pages is becoming increasingly rich, with a growing number of
presentation gimmicks), and the use of advanced (particularly HTML) editors that require
more and more expertise on the part of translators.

[0013] One object of the present invention consists of allowing markup documents
to be localized without any user intervention.

[0014] Another object consists of modifying the editors to allow the
internationalization of markup pages.

[0015] Another object of the present invention consists of facilitating localization 
operations in a computer system, in particular by avoiding the need to recompile it or to 
deliver its sources. 

Summary of the Invention 

[0016] In this context, the subject of the present invention is a method for
internationalizing the content of markup documents, which consists of:

• detecting, by means of a localization tool, a tag to be used in the localization of the
document, one or more localization attributes, and possibly a default localization value
associated with said tag;

• searching, if necessary, in storage means in a translation file, for the localized value of
the element associated with this or these localization attribute(s);

• replacing the tag in the document with the localized value found in the translation file,
or with the default localization value, or with a value obtained via automatic
transcription functions.

[0017] The present invention also concerns a system for internationalizing the
content of markup documents, comprising:

• means for storing markup documents;

• means for storing translation files for the documents;

• a localization tool connected to said storage means and allowing the content of the
document to be localized using the translation file.

[0018] The present invention also relates to a method for editing and
internationalizing markup documents that consists, each time a user enters content to be
internationalized during the editing of the document (8), of associating localization
attributes with said content, proposing the entry of a default value of the content to be
internationalized, and proposing the entry of all or some of the various values assumed by
this content in the various target languages of the document being edited; then of creating
the document and the associated translation files from information obtained from the user,
and storing said files in storage means.

[0019] The present invention concerns an editing and internationalization system 
comprising an editor in a machine for editing markup documents, which makes it possible 
to create reference files and associated translation files from information obtained from 
the user and store them in storage means. 

Presentation of the Figures 

[0020] Other characteristics and advantages of the invention will become clear in
light of the following description, given as an illustrative and non-limiting example of the
present invention, in reference to the attached drawings in which:

• Fig. 1 is a schematic view of an embodiment of the internationalization system
according to the invention;

• Fig. 2 is a schematic view of an embodiment of an editor that allows the
internationalization according to the invention.

Description of an Embodiment of the Invention 

[0021] As shown in Fig. 1, which illustrates an embodiment of the
internationalization system according to the invention, a computer system 1 is distributed
and composed of machines 2-4 organized into one or more networks 5. A machine is a
very broad conceptual unit that includes both hardware and software. The machines can be
quite diverse, such as workstations, servers, routers, specialized machines and gateways
between networks. A machine comprises at least one processor, at least one memory, and
possibly one or more peripherals. Only the components of the machines of the system 1
that are characteristic of the present invention will be described, the other components
being known to one skilled in the art.

[0022] It should be noted that the machines 2-4 can be grouped with one another
in various ways and can, for example, constitute one and the same machine.

[0023] Fig. 1 represents an exemplary embodiment of the internationalization 
system according to the invention. 

[0024] The internationalization system according to Fig. 1 comprises a reference 
machine 2, a translation machine 3, and a localization machine 4. 

[0025] The reference machine 2, in the example illustrated in Fig. 2, is connected
to a network 6 of machines, such as for example the Internet. The reference machine 2 in
the embodiment illustrated in Fig. 1 is a server for accessing the network 6. Pages, HTML
pages in the example illustrated, are hosted by the server 2 and are retrieved by means of a
file transfer protocol or by means of a Web page server interrogation protocol called
HTTP (Hypertext Transfer Protocol). The Web browsers present in machines of the
network 6 implement this protocol (and several others as well) and download the Web
pages and the associated files from one server to another.

[0026] The reference machine 2 contains means 7 for storing documents 8,
reference files 8 in the example. The reference files 8 contain the information to be
localized, expressed in a "pivot" language. The pivot language is the language capable of
being the most widely known among translators: its utilization makes it possible to
facilitate translations and avoid indirect translations (translation from German to English,
then from English to Spanish, instead of a direct translation from English to Spanish if
English is chosen as the pivot language). The translation machine 3 includes means 9 for
storing translation files 10. The storage means 7 and 9 can be in any format, for example
in the form of a hard disk or any other type of memory.

[0027] The localization machine 4 contains a localization tool 11 in the form of a
software module. The localization tool can use means 12 for storing the correspondence
between the type of the document 8 and the markup language used, the tags of said markup
language and its grammar and syntax, as well as automatic transcription functions. The
storage means 12 are contained in the localization machine 4 or linked to the latter. In the
embodiment illustrated in Fig. 2, they are in the form of a hard disk. The localization machine
4 makes it possible to create localized files 13 from the reference file 8 and from the
translation file 10.

[0028] As shown in Fig. 2, the present invention also concerns a Web page editor
14, for example an HTML page editor. The editor 14 is a software module contained in an
editing machine 15 and offers any user facilities for writing a Web page. The editor 14 is
connected to storage means 16 such as, for example, a hard disk. The editor is connected
to a reference machine 2, itself connected to a network 6 of machines, such as for
example the Internet. The reference machine 2, as in the embodiment illustrated in Fig. 1,
is a server 2 for accessing the network 6.

[0029] The system 1 according to the present invention works in the following
way.

[0030] The system 1 localizes the content of documents 8; in the
internationalization system illustrated in Fig. 1, the documents 8 are the reference files 8
that contain the elements to be localized.

[0031] The internationalization method according to the invention comprises a
step for identifying the type of document 8 to be localized by means of the localization
tool 11. The type of the document, and of the reference file in the example illustrated, can
be designated, as desired, by using the file name (and more precisely, its extension), a
magic number stored in the file header, or a reference to a document that makes it
possible to define the format of the document (like, for example, the DTDs, "Document
Type Definitions," of the XML language). Thus, in the example of Annex 1, the type of
the document is contained in the extension of the reference file 8, for example in the form
of a suffix ".html" or ".htm". Then, the tags <HTML> ... </HTML>, which are
characteristic of a Web document constructed with the HTML language, are retrieved.

[0032] Depending on the document type, the localization tool selects the markup
language to be used to read the document and detect the localization tags, as seen above.
The correspondences between the file extensions, for example, and the markup languages
to be used, are contained in the storage means 12.
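
The type identification described above can be sketched as follows. This is a minimal illustration, not the tool 11 itself: the table `EXTENSION_TO_MARKUP` is a hypothetical stand-in for the correspondences held in the storage means 12, and the function name and the content-based fallback are assumptions made for the example.

```python
import os
import re

# Hypothetical correspondence table, standing in for the storage means 12.
EXTENSION_TO_MARKUP = {".html": "HTML", ".htm": "HTML", ".xml": "XML"}


def identify_markup_language(file_name, content=""):
    """Return the markup language of a document, or None if it is unknown."""
    # First criterion: the extension of the file name.
    extension = os.path.splitext(file_name)[1].lower()
    if extension in EXTENSION_TO_MARKUP:
        return EXTENSION_TO_MARKUP[extension]
    # Fallback (assumption): look for a DOCTYPE reference or the <HTML> tag.
    if re.search(r"<!DOCTYPE\s+html|<html[\s>]", content, re.IGNORECASE):
        return "HTML"
    return None
```

A file named "index.htm" would be read as HTML on the strength of its suffix alone; the content is inspected only when the extension is not in the table.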

[0033] It is important to note that the tools that make it possible to enrich the
markup pages with localization attributes, and the software programs that use them, the
localization tool in the example illustrated, must use the same conventions for
representing localization tags, and the same semantics associated with the various
attributes of these tags. The tools for editing markup pages, and the software for
interpreting these pages, must be able to be configured so as to support different forms of
the same tag, as long as this form is unambiguous in the markup language used in the
internationalized document. In the embodiments illustrated, the localization tool must be
able to recognize the localization tag chosen for the language used.

[0034] The method includes a step for identifying each element to be localized in
the document 8 in question, i.e., in the reference file 8, by the localization attributes. The
localization attributes include at least one type, which may be a default type and hence
absent from the tag. They can also include, for example, an identifier, parameters, and
specific attributes of the type, as will be seen below. For example, the message "My small
business" to be localized into various languages is identified by the unique identifier "1";
it is typed by the type TEXT (the element to be localized is a text).

[0035] The method according to the invention is based on the definition of tags
designed to mark the identified elements to be localized using the localization attributes.

[0036] The method, by means of the localization tool 11 contained in the
localization machine 4, consists of detecting in the reference file 8 a tag dedicated to
localization, retrieving the localization attribute or attributes associated with said tag,
searching in the storage means 9 for the translation file 10 corresponding to the target
language or culture, searching in the selected file 10 for said localization attributes and the
localization values associated with the unique localization attributes obtained in the
reference file 8, and replacing the tags of the reference file 8 with the corresponding
localization values provided by the translation file 10.

[0037] In the embodiment illustrated in Fig. 1, the method consists of localizing a 
Web page 8 contained in the Web page server 2. 

[0038] The tags dedicated to localization use the syntax and the grammar of the
tags of the HTML/XML language.

[0039] A summary of the characteristics of the HTML language, given below, will 
facilitate the understanding of the embodiment illustrated. 

[0040] HTML is a markup language. A tag indicates to a Web browser which
elements represent text, headers, links, images, or any other element that may be present
on a Web page. The tags have a constant form of the following type: a "<" character, a
name, possible parameters (provided in the form parameter-name "=" value-of-the-parameter),
and at the end, a ">" character. The HTML language uses, for example, the
following tags: <HTML>, <HEAD>, <TITLE>, <BODY>. Web browsers interpret the
name contained between the "less than" and "greater than" symbols: for example, the
name TITLE indicates that the text contained between the TITLE tags is a window title;
the browser displays the text in the title bar at the top of the screen of the machine in
question.

[0041] In general, tags work in pairs, but this is not always the case, as for
example with the tag <P>, which alone indicates the start of a paragraph. For tags that
work in pairs, the second tag differs from the first one by the character "/" in the second
position. For example: <HTML> ... </HTML>. The purpose of HTML tags is to
formalize some of the aspects linked to the presentation and the structuring of the
document, and to separate them from the content, the general objective being to have the
same page content with different presentations; the presentations differ so as to adapt to
the specific machine characteristics (monochrome or color screens, size of the screen,
etc.), to the user's preferences (some of which impose the fonts and character sizes to be
used for the titles, the texts, the code extracts, etc.), etc.

[0042] Annex 1 shows an example of an HTML document. As the example
shows, HTML handles the "special" characters with particular keywords (like the "e with
a circumflex accent" in the example illustrated, with the keyword "&ecirc;").

[0043] In the HTML language, there are tags for declaring links to other pages on 
the Internet, inclusions of images, video, etc. 

[0044] The HTML language is the subject of several standardization documents
(essentially IETF and W3C). Web browser vendors accommodate
these specifications, but add specific characteristics to them, in order to offer users more
services and more capabilities for customizing Internet documents. This results in an
incompatibility in representation from one browser to another. The following principle
was therefore adopted: when a tag in an HTML document is not recognized by the
browser, it is simply ignored and nothing is displayed.
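
As an illustration of this principle, the sketch below mimics what a browser unaware of a localization tag would show: the tag itself is dropped and the text it encloses remains. The tag name LOC is the example keyword used later in this description; the function name is hypothetical and the regular expression is a simplification, not a model of a real browser's parser.

```python
import re


def render_like_unaware_browser(page):
    """Drop <LOC ...> and </LOC> tags, keeping the text between them."""
    return re.sub(r"</?LOC\b[^>]*>", "", page)
```

Under this assumption, a reference file whose localization tags carry default text remains directly viewable in the pivot language.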

[0045] The example illustrated is based on pages written in HTML for essentially
two reasons:

- the presence of tags that make it possible to isolate the internationalization
information from the rest of the information, in particular display information
and content,

- the behavior of browsers faced with unknown tags, which offers debugging and
delivery facilities, the reference file being able to be used as a standard
provision for the pivot language of the application (the pivot language being
the one with which the application works by default).

[0046] The present invention can be applied to documents other than Web pages
written in HTML, if said documents are formalized with a markup language and the
syntax of same is known.

[0047] The method according to the invention includes a step for defining tags. In 
the example illustrated, HTML/XML tags are chosen to be dedicated to the localization of 
markup/web page content. 

[0048] For example, the following tags are used:
• for text messages:

<LOC ID=message-identifier [TYPE=TEXT]> Default text (optional) expressed 
in the pivot language </LOC> 

[0049] The type TEXT is the default type; if no other type is mentioned, the type 
of the element to be localized is the type TEXT by default. 
[0050] The default text proposed is the one that can be used when a translation file
is missing or when the content to be translated is absent from the translation file used for
the target language.

[0051] The default text makes it possible to do without a translation file for the 
pivot language. 
[0052] • For the date type fields:

<LOC TYPE=DATE FORMAT=format> Date and/or time expressed in a neutral
format (ex: AAAAMMJJHHMMSS) parameterizable </LOC>

[0053] The format specifies the meaning of the fields expressed in the value
provided between the two tags. For example, to represent a time, the value of the
FORMAT field is: "HHMM" or "HHMMSS". This format has little to do with
what will actually be displayed (for example: "19:28:30" or "19h28m30s"), but it makes it
possible to give a meaning to the value to be transcribed.
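
The separation between the neutral FORMAT and the displayed form can be sketched as follows, for the two time formats named above only. The function names, the style keys "colon" and "letters", and the display patterns are all hypothetical; the text does not specify how a culture selects its pattern.

```python
# Hypothetical display patterns; a real tool would pick one per target culture.
TIME_PATTERNS = {
    "colon": "{hours:02d}:{minutes:02d}:{seconds:02d}",
    "letters": "{hours:02d}h{minutes:02d}m{seconds:02d}s",
}


def parse_neutral_time(value, fmt):
    """Give a meaning to each pair of digits of `value` according to FORMAT."""
    if fmt not in ("HHMM", "HHMMSS") or len(value) != len(fmt) or not value.isdigit():
        raise ValueError(f"unsupported FORMAT/value pair: {fmt!r}, {value!r}")
    names = ["hours", "minutes", "seconds"]
    return {name: int(value[i:i + 2]) for name, i in zip(names, range(0, len(fmt), 2))}


def render_time(value, fmt, style):
    """Render a neutral time value in the chosen display style."""
    # Seconds default to zero when FORMAT is "HHMM".
    fields = {"seconds": 0, **parse_neutral_time(value, fmt)}
    return TIME_PATTERNS[style].format(**fields)
```

With these definitions, the neutral value "192830" with FORMAT "HHMMSS" renders as "19:28:30" in the "colon" style and as "19h28m30s" in the "letters" style.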
[0054] • For the number type fields:

<LOC TYPE=NUM FORMAT=format>Number defined in a neutral format (ex:
[+|-|]AAA[.BBB][e[+|-|]CCC]) parameterizable </LOC>

[0055] The format specifies the meaning of the fields expressed in the value 
provided between the two tags. For example, to represent an integer, the value of the 
FORMAT field is: "[+|-|]AAA". 
[0056] • For the currency type fields:

<LOC TYPE=CUR>Number defined in a neutral format (ex: [+|-|]AAA[.BBB])
parameterizable </LOC>

[0057] • For the image type fields (icons, etc.):

<LOC ID=message-identifier TYPE=NUM> Default path (optional),
corresponding to the pivot language </LOC>

[0058] In the present example, the tags are not dedicated to a particular language.
Their syntax, on the other hand, is that of the HTML and XML languages, thus making it
possible to cover a wide range of documents. The method is applicable to the parent
language of these two languages, i.e., the language SGML. The choice of the name of the
tag must be configurable: it is essential that there not be any collision with tags that are
already defined in the language in which the localization tags must be inserted.

[0059] The utilization of the keyword LOC as a tag identifier is proposed as an
example. This keyword could be replaced by another keyword (LOCAL,
LOCALIZATION, etc.) in other markup languages that are already using this keyword.
The choice of this keyword should be made so that it is unique and unambiguous in the
markup language used, and so that it is recognized by the various tools manipulating the
markup pages (the page editors and the tool 11, in particular).

[0060] The presence of a unique identifier of content to be localized associated
with each localization tag can be made optional for certain types of data. For numeric type
data, for example, it is possible to use automatic transcription functions, such as for
example standard localization functions (provided in the form of programs, stored in the
storage means 12 of the tool 11) that make it possible to automate the reformatting of the
information in a given language or culture from a pivot data format. For example, in the
particular case of English-speaking cultures, these programs make it possible to receive as
input a numeric value, and to produce as output a display representation that
systematically includes a comma for separating the figures into thousands. They make it
possible to automate certain translation tasks, and in particular the rendition of numeric
values; they avoid a write operation in the translation files. On the other hand, the
presence of a unique content identifier is mandatory for textual content, since it is the
search key that will be used to find the localized message in the translation files. This key
is justified by the fact that the translation of textual content cannot currently be automated
in a completely reliable way, and hence, it is not possible to do without translation files
specially formatted to reflect the content of the page to be translated.
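
An automatic transcription function of the kind described for English-speaking cultures might look like the sketch below. It accepts only the plain "[+|-|]AAA[.BBB]" neutral form (no exponent), and the function name and the accepted pattern are assumptions made for the example.

```python
import re

# Neutral numeric form: optional sign, integer digits, optional fraction.
NEUTRAL_NUMBER = re.compile(r"([+-]?)(\d+)(?:\.(\d+))?$")


def transcribe_number_en(neutral_value):
    """Render a neutral numeric string with commas between thousands."""
    match = NEUTRAL_NUMBER.match(neutral_value)
    if match is None:
        raise ValueError(f"not a neutral number: {neutral_value!r}")
    sign, integer_part, fraction = match.groups()
    # Group the integer digits by thousands; the fraction is left untouched.
    text = ("-" if sign == "-" else "") + f"{int(integer_part):,}"
    return text + ("." + fraction if fraction else "")
```

Because the output is computed from the value alone, no identifier and no entry in a translation file are needed for such an element.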

[0061] Tags as defined above for localization have been inserted into the content
of the page illustrated in Annex 1 in order to allow said page to be localized: the localized
page appears in two different ways in Annexes 2 and 3. Thus, for example, the text
message "My small business" is designated by the following localization tag:
<LOC ID=1>My small business...</LOC>

[0062] The localization attributes are the following: The identifier designated by
the tag is "1". The default text expressed in the pivot language, in this case English, is
"My small business". The type is not expressed; it is a default type, i.e., the type TEXT.

[0063] Annex 2 gives only text messages as examples, i.e., tags of the following
type:

<LOC ID=...>...</LOC>

[0064] The content of the page in Annex 3 is simpler but requires the translation
file, such as the English translation file in Annex 4, that provides the value to be associated
with each of the identifiers named in the document. In the example of Annex 3, the page
cannot be directly displayed in the pivot language: it must pass through the localization tool
that allows it to be translated.

[0065] It is possible to define particular types for information sources not provided
by the HTML language that are capable of being localized. For example, in certain
countries or in certain working environments, a clanking sound is generated whenever an
error occurs. The type of sound emitted when a particular event occurs differs depending
on the countries and the customs of each. It is therefore possible to offer an additional
type "SOUND" that makes it possible to handle this type of situation. Another example
concerns the color conventions used by certain cultures to express certain concepts or
certain events: abundance or wealth may be represented by yellow or red, and mourning
may be represented by black, white or red. It is possible to offer a "COLOR" tag that
includes a "CONCEPT" attribute and a "VALUE" attribute for representing this type of
situation.

[0066] The localization tool 11 detects the localization tags and the localization
attributes associated with said tags, searches in the storage means 9 for the translation file
10 corresponding to the target language or culture, then searches in the selected file 10 for
the localization attributes and the localization values associated with said unique
localization attributes obtained in the reference file 8.

[0067] The method also consists of defining said translation file 10. The format of
the translation file 10 is not very important: it depends on the tool 11 charged with performing
the localization of the documents. The translation file 10 includes one or more unique
localization attributes, and in most cases, as seen above, a unique identifier or identifiers
associated with a localized value that corresponds to the identifier for a given language.
Annex 4 shows an example of a translation file 10 capable of being associated with the
reference file 8 of Annexes 2 and 3 in order to display its content in English. The
translation file 10 constitutes the content model; a content model is richer than a structure
model: the content model specifies more than just the position of the titles and the
paragraphs (which in general are designated by the "structure" of the document). The
content model also indicates which information is to be provided, and within it, which
information is to be localized (with the associated localization parameters).

[0068] In the example illustrated, the localization tool of the localization machine
produces a Web page from the reference file 8 and the translation file 10. To do this, the
localization tool 11 takes the reference file 8 and replaces the localization tags of said
reference file 8 with the localized values of the identifiers of said tags given by the
appropriate translation file 10.
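
The replacement step can be sketched as follows for text messages. This is an illustration under stated assumptions, not the tool 11: the translation file 10 is represented by a plain dictionary keyed by identifier (its real format is left open by the description), the regular expression handles only tags of the form <LOC ID=...>...</LOC>, and the French value in the usage note is an invented sample.

```python
import re

# Matches <LOC ID=identifier ...>default text</LOC>; simplified for illustration.
LOC_TAG = re.compile(
    r'<LOC\s+ID="?(?P<id>[^\s">]+)"?[^>]*>(?P<default>.*?)</LOC>',
    re.IGNORECASE | re.DOTALL,
)


def localize(reference_page, translations):
    """Replace each localization tag with its localized value.

    `translations` maps identifiers to localized values. When an identifier
    is absent, the default text carried by the tag is used instead.
    """
    def replace(match):
        default_text = match.group("default").strip()
        return translations.get(match.group("id"), default_text)

    return LOC_TAG.sub(replace, reference_page)
```

For instance, calling `localize` on `<TITLE><LOC ID=1>My small business</LOC></TITLE>` with the hypothetical dictionary `{"1": "Ma petite entreprise"}` yields `<TITLE>Ma petite entreprise</TITLE>`; with an empty dictionary the default text is kept.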

[0069] The tags can delimit messages that contain parameters; in general, the
parameters are raw data to be displayed as is. The software programs must be able to
handle this type of situation, such as for example error messages of the following type:
"Error No. 1001: the file C:\COMMAND.COM does not exist." This error message
includes two parameters. The order of appearance of the parameters is important. It is not
possible to divide this message into the following segments and concatenate them:

• <LOC NUM=1>Error No. </LOC>

• <LOC NUM=2>1001</LOC>

• <LOC NUM=3> : the file </LOC>

• <LOC NUM=4>C:\COMMAND.COM</LOC>

• <LOC NUM=5> does not exist.</LOC>

[0070] In fact, certain languages do not translate this message in the same way;
they may not accept, or may delete, one of these five messages, or even change the order of
the parameters or messages. In English, for example, the message may be transformed in
the following way:

• <LOC NUM=1>Error No. </LOC>

• <LOC NUM=2>1001</LOC>

• <LOC NUM=3>C:\COMMAND.COM</LOC>

• <LOC NUM=4> file does not exist.</LOC>

[0071] The method according to the invention therefore consists of numbering the
parameters of the messages PARAM1, PARAM2, etc., and of inserting labels, for
example such as "%number", into said messages. The invention handles the preceding
case in the following way:

<LOC NUM=1 PARAM1="1001" PARAM2="C:\COMMAND.COM"> Error
No. %1 : the file %2 does not exist </LOC>
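
The label substitution can be sketched as follows. The function name and the dictionary representation of the PARAMn attributes are assumptions; the second message in the usage note is an invented rewording that only serves to show that a translation is free to reorder the labels.

```python
import re


def fill_parameters(message, params):
    """Replace the labels %1, %2, ... with the values of PARAM1, PARAM2, ..."""
    def substitute(match):
        # A function is used as the replacement so that backslashes in
        # values such as file paths are inserted literally.
        return params[f"PARAM{match.group(1)}"]

    return re.sub(r"%(\d+)", substitute, message)
```

With `params = {"PARAM1": "1001", "PARAM2": r"C:\COMMAND.COM"}`, the message "Error No. %1: the file %2 does not exist." and a reordered message such as "The file %2 is missing (error %1)." are both filled correctly, since each label is resolved by number rather than by position.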

[0072] In the example of Annexes 2 and 3, the site Quincaillerie.com is a
parameter since it does not change no matter what the language, the country or the culture.
The tag numbers this first parameter PARAM1 and identifies it with the label %1. This
information is stored in the reference file. It will be used by the localization tool charged with
re-reading the reference file and the files containing the localized messages in order to
constitute the final localized document.

[0073] It is common for portions of codes in HTML to be generated dynamically 
on the client end, in the browser, thanks to portions of code written in languages like 
JavaScript and embedded into the main HTML code. 
20 [0074] In order to allow the dynamically generated HTML content to be localized, 

the method according to the invention consists of: 

• implementing the localization tool 1 1 in the dynamic code generation language, for 
example in JavaScript; 

• including the loading of the code of the corresponding JavaScript localization tool 1 1 
25 in the main HTML web page (the one that generates HTML code on the fly in the 

client); 

• having the JavaScript localization tool 1 1 load the translation files 10 required for the 
localization of the HTML code generated in the client; 

• making use of the JavaScript localization tool 1 1 as the HTML code is generated. 
[0075] Instead of using a JavaScript version of the localization tool 11, the 
designer of the Internet document can use CGI (Common Gateway Interface) components. 
The CGI components are located in the server 2 and make it possible to execute actions, 
query databases, etc.; they are intended to generate HTML pages, and CGI is an Internet 
standard. The CGI components are capable of performing the necessary localization 
operations, provided that each CGI component is sent a variable indicating the target 
language offered to the user. 
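
By way of illustration only, a server-side component receiving the target language as a variable might be sketched as follows in Python. The variable name lang, the function render_page, the in-memory TRANSLATIONS table, and the French wording are all assumptions made for the example; they stand in for the translation files 10 and are not taken from the present description.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory stand-ins for the translation files 10.
TRANSLATIONS = {
    "en": {"1": "My small business..."},
    "fr": {"1": "Ma petite entreprise..."},
}


def render_page(query_string: str, default_language: str = "en") -> str:
    """Build an HTML fragment in the language named by the 'lang' variable."""
    fields = parse_qs(query_string)
    language = fields.get("lang", [default_language])[0]
    # Fall back to the default language when the requested one is unknown.
    messages = TRANSLATIONS.get(language, TRANSLATIONS[default_language])
    return f"<H1>{messages['1']}</H1>"


print(render_page("lang=fr"))
```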

[0076] The embodiments of the method according to the present invention are 
quite varied. According to one embodiment, which fits into the context of a software 
development process, the step of creating localized files is done "in the factory" prior to 
the storage of all the files of the application on CD-ROM, in which case the localized files 
are made available to the producer before the burning of the CD-ROM and the localized 
files are delivered (with or without the reference files) directly on the CD-ROM. This 
embodiment avoids the need to deliver, document, and maintain the localization tool 11 
for third parties. 

[0077] The localization tool 11 may be delivered (with the reference files 8) to 
third parties so that they themselves can expand the number of languages supported. This 
requires documentation of the reference files 8 in order to facilitate the creation of 
translation files 10, and the establishment of a structure capable of responding to 
questions from these third parties. 

[0078] Another embodiment consists of delivering the localization tool 11 and the 
reference files 8, the localized files being created upon installation of the software (which 
saves space on the CD-ROM of said software), on request (some time after installation), or 
"on the fly" (i.e., during execution, as explained above in connection with JavaScript 
solutions embedded into web pages, or the use of CGI processes). 

[0079] Another embodiment of the invention concerns MP3 CD-ROM readers 
into which XML files are written. The XML files contain information on the titles 
stored on the CD-ROM, the words associated with each of the titles, and the MP3 
encoding of the titles in question. The XML files constitute reference files 8. When the 
CD-ROM reader reads the XML files, it is connected to a localization tool and translation 
files that make it possible to create localized XML files based on the country in which the 
reader is located. 

[0080] The method according to the invention is also capable of being 
implemented in the Web page editor 12. The editor 12, each time a user enters content to 
be internationalized (a text, in particular), associates a unique identifier with said content, 
proposes the entry of a default value of the content to be internationalized, and proposes 
the entry of various values assumed by this content in the various target languages of the 
document being edited. The editor creates the reference file 8 and the associated 
translation files 10, and stores them in storage means 16. The editor offers: 

• ergonomic entry of these various messages for the various target languages of the 
document; 

• ease in storing and creating translation files 10 that are readable by translators who do 
not have the tool for editing the content to be localized; 

• easy creation of localized content from reference files 8 and translation files 10. 
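
The behavior of the editor described in [0080] can be sketched as follows. This is a minimal Python sketch under stated assumptions: the class EditorSession, its method add_content, the language codes used as keyword arguments, and the Spanish wording are hypothetical and serve only to show a unique identifier being associated with each piece of content while the reference form and the translation data are built side by side.

```python
from itertools import count


class EditorSession:
    """Collects localizable content and emits reference and translation data."""

    def __init__(self) -> None:
        self._ids = count(1)    # source of unique identifiers
        self.translations = {}  # language -> {identifier: localized value}

    def add_content(self, default: str, **localized: str) -> str:
        """Register one piece of content and return its tagged reference form."""
        identifier = str(next(self._ids))
        for language, value in localized.items():
            self.translations.setdefault(language, {})[identifier] = value
        # The default value stays inside the tag, as in Annex 2.
        return f"<LOC ID={identifier}>{default}</LOC>"


session = EditorSession()
print(session.add_content("Nuts", es="Tuercas"))
print(session.translations)
```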

[0081] Unlike equivalent editors that might exist for creating particular 
documents, this editor would have to store the localization attributes associated with the 
document element to be localized: the type of element localized (text, a number, a 
monetary value, an icon, a sound, a color, etc.), the parameters associated with this type, 
and the parameters associated with the message to be localized (for messages that are 
partly fixed and partly variable). 

[0082] One advantage of the present invention lies in the behavior of browsers faced 
with unknown tags, which in the present invention are the tags dedicated to localization: as 
seen above, when the browser encounters such tags, it ignores them. Thus, the reference file 
containing the original page expressed in the pivot language may be used as is, without its 
being necessary to pass it through the localization tool 11 (with a few exceptions, 
particularly related to messages with parameters). 

[0083] Another advantage is the existence of markup language syntax analyzers; 
they are numerous, space-efficient and very easy to use. The localization tool could 
therefore be built on existing syntax analyzers. 

[0084] The present invention concerns the method for internationalizing the 
content of markup documents 8 that consists of: 

• detecting a tag dedicated to the localization of the document 8, the localization 
attribute or attributes, and possibly a default localization value associated with said 
tag by means of the localization tool 11; 

• searching, if necessary, in the storage means 9 in the translation file 10, for the 
localized value of the element associated with this or these localization attribute(s); 

• replacing the tag in the document 8 with the localized value found in the translation 
file 10, or with the default localization value, or with a value obtained via automatic 
transcription functions. 
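
The three steps above (detection, search, replacement) can be sketched as follows. This is an illustrative Python sketch only: the regular expression, the function name localize, and the Spanish wording are assumptions, the sketch handles only tags of the form used in Annex 2 without parameters, and the automatic transcription functions are left out.

```python
import re

# Matches <LOC ID=n>default</LOC>. The tag name and the ID attribute follow
# Annex 2; the regular expression itself is an assumption of this sketch and
# does not cover tags that carry PARAM attributes.
LOC_TAG = re.compile(r"<LOC\s+ID=(\d+)\s*>(.*?)</LOC>", re.DOTALL)


def localize(document: str, translations: dict[str, str]) -> str:
    """Replace each localization tag with its localized or default value."""

    def replace(match: re.Match) -> str:
        identifier, default_value = match.group(1), match.group(2)
        # Use the localized value when one exists, else the default value.
        return translations.get(identifier, default_value)

    return LOC_TAG.sub(replace, document)


reference = "<H2><LOC ID=3>Small equipment</LOC></H2><LI><LOC ID=4>Nuts</LOC></LI>"
spanish = {"4": "Tuercas"}  # hypothetical content of a translation file 10
print(localize(reference, spanish))
# → <H2>Small equipment</H2><LI>Tuercas</LI>
```

In this sketch, the element with identifier 3 has no entry in the translation data, so its default localization value is kept.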

[0085] The method consists of searching for the type of the document 8 in order to 
recognize the tags used in said document and their grammar and syntax, and performing a 
detection of the tags dedicated to localization. 

[0086] The method consists of using as localization attributes a unique identifier, 
an element type, and possibly parameters and/or specific attributes of the type. 

[0087] The tag dedicated to localization assumes the formalism of a markup 
language. 

[0088] The method consists of using, for localization purposes, tags that are not 
provided in the markup language used. 

[0089] The method consists of creating, prior to the detection, the translation file 
10 that includes the localization attribute or attributes of the element or elements to be 
localized, associated with the corresponding localized value of the localization attribute or 
attributes in a given language. 
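
As an illustration, a translation file laid out like Annex 4 (one identifier followed by its localized value per line) could be read as follows. The layout is inferred from Annex 4 alone, and the function name parse_translation_file is hypothetical; this is a sketch, not a specification of the file format.

```python
def parse_translation_file(text: str) -> dict[str, str]:
    """Map each identifier to its localized value, one entry per line."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        identifier, _, value = line.partition(" ")
        # Keep only lines that begin with a numeric identifier.
        if identifier.isdigit():
            entries[identifier] = value.strip()
    return entries


sample = """1 My small business...

2 You are on the site Quincaillerie.com

3 Small equipment
"""
print(parse_translation_file(sample))
```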

[0090] Prior to the detection, the localization tool 11 is implemented in a dynamic 
code generation language, and the code of the tool 11 is loaded into the document 8, 
which dynamically generates its own code, the replacement of the tags taking place as the 
code of the document 8 is generated dynamically. 

[0091] The present invention relates to the system for implementing the method 
described above, characterized in that it includes the localization tool 11 and the means 9 
for storing the translation file. The present invention also relates to a system for 
internationalizing the content of markup documents 8, comprising: 

• the means 7 for storing markup documents 8; 

• the means 9 for storing the translation files 10 of the documents 8; 

• the localization tool 11 connected to said storage means 7, 9 and allowing the content 
of the document 8 to be localized using the translation file. 

The localization tool 11 is implemented in a dynamic code generation language, and 
the code of the tool 11 is loaded into the document 8, which dynamically generates its 
own code. 

[0092] Alternatively, the localization tool 11 is a CGI component. 

[0093] The present invention concerns a method for editing and internationalizing 
markup documents 8 that consists, each time a user enters content to be internationalized 
during the editing of the document 8, of associating the localization attribute or 
attributes with said content, proposing the entry of a default value of the content to be 
internationalized, and proposing the entry of all or some of the various values assumed by 
this content in the various target languages of the document being edited; and of creating 
the document 8 and the associated translation files 10 from information obtained from the 
user, and storing said files in the storage means 16. 

[0094] The present invention concerns an editing and internationalization system 
comprising the editor 14 in the machine 15 for editing markup documents 8, which makes 
it possible to create reference files and associated translation files from information 
obtained from the user and store them in the storage means 16. 

[0095] The present invention concerns the method for internationalizing the 
content of markup documents 8, which consists of: 

• Defining tags dedicated to localization; 

• Identifying the information to be localized in the document 8 by means of one 
or more localization attributes; 
[0096] Associating the localization tags with the localization attributes in the 
document 8 in order to allow its localization. 

ANNEX 1 


<HTML> 
<HEAD> 
<TITLE> 
Quincaillerie.com 
</TITLE> 
</HEAD> 
<BODY> 

<H1><CENTER>My small business...</CENTER></H1> 

Vous &ecirc;tes sur le site <B>Quincaillerie.com</B><P> 
[You are on the site <B>Quincaillerie.com</B><P>] 

<H2>Small equipment</H2> 
<UL> 
<LI>Nuts</LI> 
<LI>Bolts</LI> 
</UL> 

<H2>Household appliances</H2> 
<UL> 
<LI>Washing machine</LI> 
<LI>Dishwasher</LI> 
</UL> 

<H2>My partners</H2> 
<UL> 
<LI><A HREF="http://www.bullsoft.com">BullSoft</A></LI> 
</UL> 
</BODY> 
</HTML> 

ANNEX 2 


<HTML> 
<HEAD> 
<TITLE> 
Quincaillerie.com 
</TITLE> 
</HEAD> 
<BODY> 

<H1><CENTER><LOC ID=1>My small 
business...</LOC></CENTER></H1> 

<LOC ID=2 PARAM1="<B>Quincaillerie.com</B>">Vous &ecirc;tes sur le 
site de %1</LOC><P> 
[<LOC ID=2 PARAM1="<B>Quincaillerie.com</B>">You are on the site 
%1</LOC><P>] 

<H2><LOC ID=3>Small equipment</LOC></H2> 
<UL> 
<LI><LOC ID=4>Nuts</LOC></LI> 
<LI><LOC ID=5>Bolts</LOC></LI> 
</UL> 

<H2><LOC ID=6>Household appliances</LOC></H2> 
<UL> 
<LI><LOC ID=7>Washing machine</LOC></LI> 
<LI><LOC ID=8>Dishwasher</LOC></LI> 
</UL> 

<H2><LOC ID=9>My partners</LOC></H2> 
<UL> 
<LI><A HREF="http://www.bullsoft.com">BullSoft</A></LI> 
</UL> 
</BODY> 
</HTML> 

ANNEX 3 

<HTML> 
<HEAD> 
<TITLE> 
Quincaillerie.com 
</TITLE> 
</HEAD> 
<BODY> 

<H1><CENTER><LOC ID=1> </LOC></CENTER></H1> 

<LOC ID=2 PARAM1="<B>Quincaillerie.com</B>"> </LOC><P> 

<H2><LOC ID=3> </LOC></H2> 
<UL> 
<LI><LOC ID=4> </LOC></LI> 
<LI><LOC ID=5> </LOC></LI> 
</UL> 

<H2><LOC ID=6> </LOC></H2> 
<UL> 
<LI><LOC ID=7> </LOC></LI> 
<LI><LOC ID=8> </LOC></LI> 
</UL> 

<H2><LOC ID=9> </LOC></H2> 
<UL> 
<LI><A HREF="http://www.bullsoft.com">BullSoft</A></LI> 
</UL> 
</BODY> 
</HTML> 

ANNEX 4 


1 My small business... 

2 You are on the site Quincaillerie.com 

3 Small equipment 

4 Household appliances 

5 ... 